@Sentiment Analysis on Classical Chinese Poetry

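This paper aims to improve sentiment analysis of classical poetry by introducing a multi-task framework with hierarchical attention enhanced by short-line sentiment labels. The method exploits the unique information in the short lines that compose a poem, and applies sentiment analysis to both the whole poem and its short lines. The experimental results show that the method outperforms the existing state of the art, with significant gains in both accuracy and macro-F1.
We propose to utilize the unique information from the individual short lines that compose the poem, and introduce a multi-task framework with hierarchical attention enhanced with short line sentiment labels.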

Introduction

To analyse sentiment in classical Chinese poetry, previous studies have explored different methods, for example, constructing sentiment lexicons (Hou and Frank, 2015; Zhang et al., 2023), transferring knowledge from modern Chinese (Zhao et al., 2014), or extracting imagery words (Shen et al., 2019; Su et al., 2023). [Note: existing sentiment analysis methods are building sentiment lexicons, transferring knowledge from modern Chinese translations, and extracting imagery words.]

Although these studies improved the general performance for the task of sentiment analysis in classical Chinese poetry, by utilizing special words in the poems or drawing upon knowledge beyond the poems, they did not consider the compositional structure of the poems. (Note: the compositional structure of the poem is not considered.) Usually, a classical Chinese poem comprises several short lines, which may show different emotions, and in turn contribute to the overall emotional expression of the poem. (Note: short lines may carry different emotions.) Thus in this paper, for the task of sentiment analysis of classical Chinese poetry, we propose to take the sentiment of short lines into consideration by using a multi-task framework with a hierarchical attention network, which includes the sentiment analysis task of both the overall poem and the short lines of which the poem is comprised. We will show that, by leveraging the sentiment information from the short lines, we can outperform the current state-of-the-art in sentiment analysis of ancient Chinese poetry.
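A minimal sketch of how a multi-task objective of this kind could combine the poem-level and short-line tasks. This is an illustration under assumptions, not the paper's implementation: the unweighted-mean line loss, the `line_weight` parameter, and the function names are all hypothetical, and the logits would in practice come from a neural encoder.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, gold):
    """Negative log-probability assigned to the gold class index."""
    return -math.log(softmax(logits)[gold])


def multitask_loss(poem_logits, poem_gold, line_logits, line_golds, line_weight=1.0):
    """Poem-level loss plus a weighted mean of the short-line losses.

    line_weight is a hypothetical hyperparameter (assumption, not from the paper).
    """
    poem_loss = cross_entropy(poem_logits, poem_gold)
    line_losses = [cross_entropy(l, g) for l, g in zip(line_logits, line_golds)]
    line_loss = sum(line_losses) / len(line_losses)
    return poem_loss + line_weight * line_loss


# Toy example: one poem with two short lines and three sentiment classes.
loss = multitask_loss(
    poem_logits=[2.0, 0.5, -1.0], poem_gold=0,
    line_logits=[[1.5, 0.0, 0.0], [0.0, 0.2, 1.0]], line_golds=[0, 2],
)
print(f"joint loss: {loss:.4f}")
```

Setting `line_weight=0.0` reduces the objective to the single-task case, which makes the contribution of the short-line labels easy to isolate.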
F1 scores by fine-grained category

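Related Work

1. General methods for poetry sentiment analysis

2. Studies specific to classical Chinese poetry

3. Multi-task learning and hierarchical attention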

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to identify the sentiment polarity expressed toward a specific aspect within a sentence.

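Hierarchical Attention Network (HAN) is a deep learning model for classifying long texts (such as documents or passages), proposed by Zichao Yang et al. in 2016. Its core idea is to imitate the hierarchical attention a human reader applies to a document, capturing how the importance of text varies at different levels.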
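The two-level idea can be sketched in a few lines. The version below is deliberately simplified: the original HAN encodes each level with a bidirectional GRU and passes hidden states through a tanh projection before scoring, both of which are omitted here, and the context vectors are fixed inputs rather than learned parameters. All function names are hypothetical. For poetry, the natural mapping (an assumption on my part) is characters within a short line at the lower level and short lines within a poem at the upper level.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(vectors, context):
    """Weight each vector by softmax(vector . context); return (pooled, weights)."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def hierarchical_encode(document, word_context, sentence_context):
    """document: list of sentences, each a list of word vectors.

    Lower level: pool words into one vector per sentence.
    Upper level: pool sentence vectors into one document vector.
    """
    sentence_vectors = [attention_pool(words, word_context)[0] for words in document]
    return attention_pool(sentence_vectors, sentence_context)


# Toy document: two sentences with 2-dimensional word vectors.
toy_document = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 2.0]]]
doc_vector, sentence_weights = hierarchical_encode(toy_document, [1.0, 0.0], [0.5, 0.5])
print(doc_vector, sentence_weights)
```

The returned upper-level weights indicate how much each sentence (or short line) contributes to the final representation, which is what makes the model's decisions partly inspectable.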

Dataset

Pasted image 20250304142640.png

Thanks to the work of Chen et al. (2019), we now have a fine-grained sentimental poetry corpus (FSPC), which we are going to utilize in our experiments.

Method

Pasted image 20250304142836.png

Experiments and Results

1. Experimental design

2. Key results

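Experiment 1: Effect of framework complexity
Model configuration                                     Accuracy (%)  F1-macro (%)
Single-task (overall sentiment classification only)     69.00         66.27
Multi-task (joint short-line + overall classification)  69.32         66.50
Multi-task + HAN                                        70.06         67.49
Multi-task + HAN + short-line label enhancement         70.96         68.51

Experiment 2: Comparison of pre-trained models
Pre-trained model     Accuracy (%)  F1-macro (%)
BERT_CCPoem           67.54         65.24
BERT-base-Chinese     69.60         67.33
BERT-ancient-Chinese  70.28         68.31
SikuBERT              70.96         68.51
SikuRoBERTa           72.88         71.05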
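
To make the size of each step easier to read off, the short script below recomputes the differences between consecutive rows of Experiment 1 and the spread between the weakest and strongest encoder in Experiment 2. The numbers are copied from the two tables above; the variable and function names are illustrative only.

```python
# (configuration, accuracy %, macro-F1 %) copied from the Experiment 1 table
ablation = [
    ("single-task", 69.00, 66.27),
    ("multi-task", 69.32, 66.50),
    ("multi-task + HAN", 70.06, 67.49),
    ("multi-task + HAN + short-line labels", 70.96, 68.51),
]

# accuracy % per pre-trained model, copied from the Experiment 2 table
encoder_accuracy = {
    "BERT_CCPoem": 67.54,
    "BERT-base-Chinese": 69.60,
    "BERT-ancient-Chinese": 70.28,
    "SikuBERT": 70.96,
    "SikuRoBERTa": 72.88,
}


def stepwise_gains(rows):
    """Return (name, accuracy gain, F1 gain) of each row over the previous one."""
    return [
        (cur[0], round(cur[1] - prev[1], 2), round(cur[2] - prev[2], 2))
        for prev, cur in zip(rows, rows[1:])
    ]


for name, d_acc, d_f1 in stepwise_gains(ablation):
    print(f"{name}: +{d_acc:.2f} accuracy, +{d_f1:.2f} macro-F1")

encoder_spread = round(max(encoder_accuracy.values()) - min(encoder_accuracy.values()), 2)
print(f"encoder spread: {encoder_spread:.2f} accuracy points")
```

Read this way, the full framework adds 1.96 accuracy points and 2.24 macro-F1 points over the single-task baseline, while the choice of pre-trained model alone moves accuracy by 5.34 points.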

3. Analysis of results

Confusion matrix
Pasted image 20250304144642.png

Pasted image 20250304152711.png
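
For reference, both reported metrics can be derived from a confusion matrix. The sketch below assumes rows are gold labels and columns are predictions; the matrices in it are invented toy values, not the numbers from the figures above.

```python
def accuracy(confusion):
    """Fraction of examples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


def macro_f1(confusion):
    """Unweighted mean of per-class F1 (rows = gold, columns = predicted)."""
    n = len(confusion)
    f1_scores = []
    for k in range(n):
        true_pos = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))
        gold_k = sum(confusion[k])
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / gold_k if gold_k else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / n


# Hypothetical two-class matrices (toy data, not from the paper).
balanced = [[8, 2], [1, 9]]
skewed = [[90, 0], [10, 0]]  # the minority class is never predicted

print(f"balanced: acc={accuracy(balanced):.3f}, macro-F1={macro_f1(balanced):.3f}")
print(f"skewed:   acc={accuracy(skewed):.3f}, macro-F1={macro_f1(skewed):.3f}")
```

Because macro-F1 averages over classes rather than examples, a model that ignores a rare class keeps a high accuracy but loses heavily on macro-F1, which is why the two metrics are usually reported together for imbalanced sentiment labels.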

4. Performance breakthrough